- INFO.TXT for MPEG Audio Layer-3 Shareware Code
-
- Version 1.44 - 26.4.94
-
- This text is organized as a kind of Mini-FAQ (Frequently Asked
- Questions). It covers several topics:
-
- 1. ISO-MPEG Standard
- 2. MPEG Audio Codec Family ("Layer 1, 2, 3")
- 3. Layer-3 Products
-
- For further comments and questions regarding Layer-3,
- please contact:
-
- layer3@iis.fhg.de
-
- or
-
- Fraunhofer-IIS, Erlangen, Germany, Fax: +49-9131-776-399
-
- For further information about MPEG, you may also like to contact:
-
- phade@cs.tu-berlin.de
-
-
- 1. ISO-MPEG Standard
-
-
- Q: What is MPEG, exactly?
-
- A: MPEG is the "Moving Picture Experts Group", working under the
- joint direction of the International Organization for
- Standardization (ISO) and the International Electrotechnical
- Commission (IEC). This group works on standards for the coding of
- moving pictures and associated audio.
-
-
- Q: What is the status of MPEG's work, then? What about MPEG-1, -2,
- and so on?
-
- A: MPEG approaches the growing need for multimedia standards step-by-
- step. Today, three "phases" are defined:
-
- MPEG-1: "Coding of Moving Pictures and Associated Audio for
- Digital Storage Media at up to about 1.5 MBit/s"
-
- Status: International Standard IS-11172, completed in 10.92
-
- MPEG-2: "Generic Coding of Moving Pictures and Associated Audio"
-
- Status: Committee Draft CD 13818 as found in documents MPEG93 /
- N601, N602, N603 (11.93)
-
- MPEG-3: no longer exists (has been merged into MPEG-2)
-
- MPEG-4: "Very Low Bitrate Audio-Visual Coding"
-
- Status: Call for Proposals 11.94, Working Draft in 11.96
-
-
- Q: MPEG-1 is ready-for-use. What does the standard look like?
-
- A: MPEG-1 consists of 4 parts:
-
- IS 11172-1: System
- describes synchronization and multiplexing of video and audio
-
- IS 11172-2: Video
- describes compression of non-interlaced video signals
-
- IS 11172-3: Audio
- describes compression of audio signals
-
- CD 11172-4: Compliance Testing
- describes procedures for determining the characteristics of coded
- bitstreams and the decoding process and for testing compliance
- with the requirements stated in the other parts
-
-
- Q: How do I get the MPEG documents?
-
- A: You may order them from your national standards body.
- E.g., in Germany, please contact:
- DIN-Beuth Verlag, Auslandsnormen
- Mrs. Niehoff, Burggrafenstr. 6, D-10772 Berlin, Germany
- Phone: 030-2601-2757, Fax: 030-2601-1231
-
-
- 2. MPEG Audio Codec Family ("Layer 1, 2, 3")
-
-
- Q: Talking about MPEG audio coding, I heard a lot about "Layer 1, 2
- and 3". What does it mean, exactly?
-
- A: MPEG-1, IS 11172-3, describes the compression of audio signals
- using high performance perceptual coding schemes. It specifies a
- family of three audio coding schemes, simply called Layer-1,-2,-3,
- with increasing encoder complexity and performance (sound quality
- per bitrate). The three codecs are compatible in a hierarchical
- way, i.e. a Layer-N decoder is able to decode bitstream data
- encoded in Layer-N and all Layers below N (e.g., a Layer-3
- decoder may accept Layer-1,-2 and -3, whereas a Layer-2 decoder
- may accept only Layer-1 and -2.)
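-
- The hierarchical rule above can be written down in a few lines.
- This is only an illustrative Python sketch; the function name
- can_decode is made up for this example and does not come from the
- standard.

```python
def can_decode(decoder_layer: int, stream_layer: int) -> bool:
    """Return True if a Layer-`decoder_layer` decoder accepts a
    bitstream encoded in Layer-`stream_layer`."""
    for layer in (decoder_layer, stream_layer):
        if layer not in (1, 2, 3):
            raise ValueError(f"MPEG-1 audio defines Layers 1 to 3, got {layer}")
    # Hierarchical rule: a Layer-N decoder handles Layer-N and all Layers below N.
    return stream_layer <= decoder_layer


if __name__ == "__main__":
    print(can_decode(3, 1))  # Layer-3 decoder, Layer-1 stream
    print(can_decode(2, 3))  # Layer-2 decoder, Layer-3 stream
```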
-
-
- Q: So we have a family of three audio coding schemes. What does the
- MPEG standard define, exactly?
-
- A: For each Layer, the standard specifies the bitstream format and
- the decoder. To allow for future improvements, it does *not*
- specify the encoder, but an informative chapter gives an example
- of an encoder for each Layer.
-
-
- Q: What do the three audio Layers have in common?
-
- A: All Layers use the same basic structure. The coding scheme can be
- described as "perceptual noise shaping" or "perceptual subband /
- transform coding".
-
- The encoder analyzes the spectral components of the audio signal
- by calculating a filterbank or transform and applies a
- psychoacoustic model to estimate the just noticeable noise-
- level. In its quantization and coding stage, the encoder tries
- to allocate the available number of data bits in a way to meet
- both the bitrate and masking requirements.
-
- The decoder is much less complex. Its only task is to synthesize
- an audio signal out of the coded spectral components.
-
- All Layers use the same analysis filterbank (polyphase with 32
- subbands). Layer-3 adds an MDCT transform to increase the frequency
- resolution.
-
- All Layers use the same "header information" in their bitstream,
- to support the hierarchical structure of the standard.
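-
- Because the header is common to all Layers, a decoder can read it
- first and then decide how to continue. The following Python sketch
- illustrates the idea. Please note the assumptions: the field
- positions and code tables (12-bit syncword, 2-bit layer code, 4-bit
- bitrate index, 2-bit sampling frequency code) are not given in this
- text; they are quoted from the header syntax of IS 11172-3 and
- should be checked against the standard. The function name is
- hypothetical.

```python
# Code tables as quoted from IS 11172-3 (assumption, verify against the standard).
LAYER_BY_CODE = {0b11: 1, 0b10: 2, 0b01: 3}           # 0b00 is reserved
SAMPLE_RATE_BY_CODE = {0b00: 44100, 0b01: 48000, 0b10: 32000}  # 0b11 is reserved


def parse_common_header(header: bytes) -> dict:
    """Extract the fields shared by all Layers from the first 4 header bytes."""
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    word = int.from_bytes(header[:4], "big")
    if (word >> 20) != 0xFFF:                 # 12-bit syncword, all ones
        raise ValueError("syncword not found")
    layer_code = (word >> 17) & 0b11          # 2-bit layer field
    rate_code = (word >> 10) & 0b11           # 2-bit sampling frequency field
    if layer_code not in LAYER_BY_CODE or rate_code not in SAMPLE_RATE_BY_CODE:
        raise ValueError("reserved field value")
    return {
        "layer": LAYER_BY_CODE[layer_code],
        "sample_rate_hz": SAMPLE_RATE_BY_CODE[rate_code],
        "bitrate_index": (word >> 12) & 0xF,  # index into a per-Layer bitrate table
    }


if __name__ == "__main__":
    print(parse_common_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
```

- The bitrate index is left undecoded on purpose, as its meaning
- depends on the Layer found in the same header.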
-
- All Layers use a bitstream structure that contains parts that are
- more sensitive to bit errors ("header", "bit allocation",
- "scalefactors", "side information") and parts that are less
- sensitive ("data of spectral components").
-
- All Layers may use 32, 44.1 or 48 kHz sampling frequency.
-
- All Layers are allowed to work with similar bitrates:
- Layer-1: from 32 kbps to 448 kbps
- Layer-2: from 32 kbps to 384 kbps
- Layer-3: from 32 kbps to 320 kbps
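-
- The limits above are easy to check in software. This is a minimal
- Python sketch under one simplifying assumption: it only tests the
- ranges listed here, whereas the standard allows only certain
- discrete bitrates within each range. The names are made up for
- this example.

```python
SAMPLING_FREQUENCIES_KHZ = (32.0, 44.1, 48.0)

# (lowest, highest) bitrate in kbps for each Layer, as listed in this FAQ.
BITRATE_RANGE_KBPS = {1: (32, 448), 2: (32, 384), 3: (32, 320)}


def is_allowed(layer: int, bitrate_kbps: float, sampling_khz: float) -> bool:
    """Check a configuration against the ranges listed in this FAQ only."""
    if layer not in BITRATE_RANGE_KBPS:
        raise ValueError(f"unknown Layer: {layer}")
    low, high = BITRATE_RANGE_KBPS[layer]
    return low <= bitrate_kbps <= high and sampling_khz in SAMPLING_FREQUENCIES_KHZ


if __name__ == "__main__":
    print(is_allowed(3, 128, 44.1))
    print(is_allowed(3, 384, 44.1))
```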
-
-
- Q: What are the main differences between the three Layers, from a
- global view?
-
- A: From Layer-1 to Layer-3,
- complexity increases (mainly true for the encoder),
- overall codec delay increases, and
- performance increases (sound quality per bitrate).
-
-
- Q: Which Layer should I use for my application?
-
- A: Good question. Of course, it depends on all your requirements. But
- as a first approach, you should consider the available bitrate of
- your application as the Layers have been designed to support
- certain areas of bitrates most efficiently, i.e. with a minimum
- drop of sound quality.
-
- Let us look a little closer at the strong domains of each Layer.
- The ISO target bitrates indicate the main areas of optimization
- for each Layer.
-
- Layer-1: Its original ISO target bitrate was 192 kbps per audio
- channel.
-
- Layer-1 is a simplified version of Layer-2. It is most useful for
- "high" bitrates around or above 192 kbps. A
- version of Layer-1 is used as "PASC" with the DCC recorder.
-
- Layer-2: Its original ISO target bitrate was 128 kbps per audio
- channel.
-
- Layer-2 is identical to MUSICAM. It has been designed as a
- trade-off between sound quality per bitrate and encoder complexity.
- It is most useful for "medium" bitrates of 128 or even 96 kbps per
- audio channel. The DAB (EU 147) proponents have decided to use
- Layer-2 in the future Digital Audio Broadcasting network.
-
- Layer-3: Its original ISO target bitrate was 64 kbps per audio
- channel.
-
- Layer-3 merges the best ideas of MUSICAM and ASPEC. It has been
- designed for best performance at "low" bitrates around 64 kbps or
- even below. The Layer-3 format specifies a set of advanced features
- that all address one goal: to preserve as much sound quality as
- possible even at rather low bitrates. Today, Layer-3 is already in
- use in various telecommunication networks (ISDN, satellite links,
- and so on) and speech announcement systems.
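-
- As a first approach, the bitrate domains described above can be
- turned into a rule of thumb. The Python sketch below is only that:
- the cut-offs at 192 and 96 kbps per channel are read off the text
- above, and the function name is made up. It ignores all other
- requirements (delay, complexity, available equipment).

```python
def suggest_layer(kbps_per_channel: float) -> int:
    """Suggest a Layer from the available bitrate per audio channel.

    Rule of thumb only: "high" bitrates (192 kbps and above) -> Layer-1,
    "medium" bitrates (96 kbps up to below 192 kbps) -> Layer-2,
    "low" bitrates (below 96 kbps) -> Layer-3.
    """
    if kbps_per_channel <= 0:
        raise ValueError("bitrate must be positive")
    if kbps_per_channel >= 192:
        return 1
    if kbps_per_channel >= 96:
        return 2
    return 3


if __name__ == "__main__":
    for rate in (256, 128, 64):
        print(rate, "kbps per channel -> Layer", suggest_layer(rate))
```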
-
-
- Q: So you tell me to consider Layer-3 for my low bitrate
- applications. I have seen equipment working with Layer-2 for low
- bitrates, too. Why should I worry about Layer-3, then?
-
- A: As I told you before, all Layers may be used for low bitrates. So
- you may also apply Layer-2 for low bitrates (e.g. 64 kbps per
- channel). But be careful!
-
- Using Layer-3 for low bitrates means:
-
- - unrivalled sound quality at 64 kbps per channel or below
- - useful for mono as well as for stereo signals
- - full audio bandwidth at 64 or 56 kbps
-
- Furthermore, if you are willing to accept some limitations,
- with Layer-3 you can get the same performance as with Layer-2,
- but at a lower bitrate.
-
-
- Q: Tell me more about sound quality. How do you assess that?
-
- A: Today, there is no alternative to expensive listening tests.
- During the ISO-MPEG-1 process, 3 international listening tests
- were performed, with a lot of trained listeners, supervised
- by Swedish Radio. They took place in 7.90, 3.91 and 11.91. Another
- international listening test was performed by CCIR, now ITU-R, in
- 92.
-
- All these tests used the "triple stimulus, hidden reference"
- method and the CCIR impairment scale to assess the audio quality.
- The listening sequence is "ABC", with A = original, BC = pair of
- original / coded signal in random order, and the listener has
- to evaluate both B and C with a number between 1.0 and 5.0. The
- meaning of these values is:
-
- 5.0 = transparent (this should be the original signal)
- 4.0 = perceptible, but not annoying (first differences noticeable)
- 3.0 = slightly annoying
- 2.0 = annoying
- 1.0 = very annoying
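-
- For reading test reports, it can help to map a score back to the
- nearest label of the scale. The Python sketch below does this.
- Rounding to the nearest anchor is an assumption made for this
- example; the scale itself is continuous, and the function name is
- made up.

```python
# Anchor points of the impairment scale as listed in this FAQ.
IMPAIRMENT_SCALE = [
    (5.0, "transparent"),
    (4.0, "perceptible, but not annoying"),
    (3.0, "slightly annoying"),
    (2.0, "annoying"),
    (1.0, "very annoying"),
]


def describe_grade(grade: float) -> str:
    """Return the label of the anchor point closest to `grade`."""
    if not 1.0 <= grade <= 5.0:
        raise ValueError("grades range from 1.0 to 5.0")
    _, label = min(IMPAIRMENT_SCALE, key=lambda item: abs(item[0] - grade))
    return label


if __name__ == "__main__":
    for grade in (3.6, 2.1):
        print(grade, "->", describe_grade(grade))
```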
-
- With perceptual codecs (like MPEG audio), all traditional
- parameters (like SNR, THD+N, bandwidth) are of very little use.
- Fraunhofer-IIS also works on objective quality assessment tools,
- like the NMR meter (Noise-to-Mask-Ratio). BTW: If you need more
- information about NMR, please contact nmr@iis.fhg.de.
-
-
- Q: Now that I know how to assess quality, come on, tell me the
- results of these tests.
-
- A: Well, for details you should study one of those AES papers listed
- below. The main result is that for low bitrates (64 kbps per
- channel), Layer-2 always scored between 2.1 and 2.6, whereas
- Layer-3 scored between 3.6 and 3.8.
-
- This is a significant increase in sound quality, indeed!
- Furthermore, the selection process for critical sound material
- showed that it was rather difficult to find worst-case material
- for Layer-3 whereas it was not so hard to find such items for
- Layer-2.
-
-
- Q: Someone claimed that some international working group on audio
- coding (TG10?) has concluded and that there was some trouble with
- Layer 3, specifically on male voice in the German language. Is
- that correct?
-
- A: One moment, please. The former CCIR has changed its name to
- ITU-Radiocommunication (ITU-R). In 1992, they founded a test group
- called TG10-2, tasked with preparing the draft for a new
- recommendation for the use of low bitrate audio coding in digital
- sound broadcasting applications.
-
- This test group concluded its work in 10.93. The draft
- recommendation defines three fields of broadcast applications:
-
- a) distribution and contribution links
- (20 kHz bandwidth, no audible impairments with up to 5 cascaded
- codecs)
-
- Recommendation: Layer-2 with 180 kbps per channel (mono or
- one independently coded channel of a stereo-signal); for a single
- distribution link without cascading, Layer-2 with 120 kbps per
- channel
-
- b) emission
- (20 kHz bandwidth)
-
- Recommendation: Layer-2 with 128 kbps per channel (mono or
- one independently coded channel of a stereo-signal)
-
- c) commentary links
- (15 kHz bandwidth)
-
- Recommendation: Layer-3 with 60 kbps for monophonic and 120 kbps
- for stereophonic signals (applying joint-stereo coding)
-
- So these are the recommendations. And again, they fit nicely
- into the above mentioned application profile of MPEG audio: with
- medium bitrates, Layer-2 performs well enough; with really
- low bitrates, you need Layer-3.
-
- The recommendations are based on international listening and
- evaluation tests performed mainly in 1992.
-
- For contribution and distribution, Layer-2 was the only system
- that fulfilled the requirements.
-
- For emission, the codecs had to score at least 4.0 on the CCIR
- impairment scale, even for the most critical material. At 128 kbps
- per channel, AC-2, Layer-2 and Layer-3 fulfilled this requirement,
- and Layer-2 got the recommendation mainly because of its
- "commonality with the distribution and contribution application".
-
- Further tests for emission were performed at 192 kbps joint-stereo
- coding. Layer-3 clearly met the requirements, Layer-2 fulfilled
- them only marginally, with doubts remaining during further tests in
- 1993. Result: *no* recommendation for 192 kbps joint-stereo.
-
- For commentary, the quality requirements were for speech
- to be equivalent to 14-bit linear PCM, and for music, some
- perceptible impairments were to be tolerated. In the test in 92,
- Layer-3 was the only codec that fulfilled these
- requirements (e.g. overall monophonic, it scored 3.6 in contrast to
- Layer-2 at 2.05 - and for male German speech, it scored 4.4 in
- contrast to Layer-2 at 2.4). So there was simply no alternative to
- Layer-3.
-
- Further tests were conducted in 93 using headphones. They showed
- that Layer-3 with monophonic speech (the test item is German male
- voice) at 60 kbps did not fully meet the quality requirements.
-
- Layer-2 was not included in these tests as its low bitrate
- performance was clearly too poor right from the start. Therefore,
- the listeners had no "lower anchor" during the listening test (the
- codec that always gets the "1" and "2" scores) - a fact that
- certainly influences the absolute scoring. Funnily enough, the
- same speech signal had been tested in some previous sessions
- without complaints...
-
- The ITU decided to recommend Layer-3 and to include a temporary
- footnote that will be removed as soon as an improved Layer-3 codec
- fulfills their requirements completely, i.e. even with that well-
- known critical male German speech item (for many other speech
- items, Layer-3 has no trouble at all).
-
-
- Q: OK, a Layer-2 codec at low bitrates may sound poor today, but
- couldn't that be improved in the future? I guess you just told me
- before that the encoder is not fixed in the standard.
-
- A: Good thinking! As the sound quality mainly depends on the encoder
- implementation, it is true that there is no such thing as a
- "Layer-N" quality. All we know for certain is the performance of
- the reference codecs during the international tests. Who knows
- what will happen in the future? What we do know now is:
-
- Today, Layer-3 already provides a sound quality that comes very
- near to CD quality at 64 kbps per channel. Layer-2 is far away
- from that.
-
- Tomorrow, both Layers may improve. Layer-2 has been designed as a
- trade-off between quality and complexity, so the bitstream format
- allows only limited innovations. In contrast, even the current
- reference Layer-3-codec exploits only a small part of the powerful
- mechanisms inside the Layer-3 bitstream format.
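-
- To put the figure of 64 kbps per channel into perspective, one can
- compare it with the data rate of uncompressed CD audio. The Python
- sketch below does the arithmetic. The CD parameters (44.1 kHz
- sampling frequency, 16 bits per sample) are not stated in this
- text; they are the usual CD values and are assumed here. The
- function names are made up.

```python
def pcm_kbps_per_channel(sample_rate_hz: int = 44100, bits_per_sample: int = 16) -> float:
    """Data rate of linear PCM for one channel, in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample / 1000


def compression_ratio(coded_kbps_per_channel: float) -> float:
    """Ratio of the assumed CD data rate to the coded data rate, per channel."""
    if coded_kbps_per_channel <= 0:
        raise ValueError("bitrate must be positive")
    return pcm_kbps_per_channel() / coded_kbps_per_channel


if __name__ == "__main__":
    print(pcm_kbps_per_channel())             # 705.6 kbps per channel
    print(round(compression_ratio(64), 1))    # about 11 : 1
```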
-
-
- Q: All in all, you sound as if anybody should use Layer-3 for low
- bitrates. Why on earth do some vendors still offer only Layer-2
- equipment for these applications?
-
- A: Well, maybe because they started to design and develop their
- system rather early, e.g. in 1990. As Layer-2 is identical to
- MUSICAM, it has been available since summer of 90, at the latest.
- In that year, Layer-3 development started, and it was successfully
- finished in spring 92. So, for a certain time, vendors could only
- exploit the existing part of the new MPEG standard.
-
- Now the situation has changed. All Layers are available, the
- standard is completed, and new systems need not limit themselves,
- but may capitalize on the full features of MPEG audio.
-
-
- Q: What other topics do I have to keep in mind? Tell me about the
- complexity of Layer-3.
-
- A: Alright. First, we have to distinguish between decoder and encoder.
-
- For a stereo Layer-3-decoder, our real-time implementations use
- either one DSP32C (AT&T) or one DSP56002 (Mot). For an ASIC,
- Intermetall (ITT) estimated an overhead of around 30 % chip area
- for adding the necessary Layer-3 modules to a Layer-2-decoder. So
- you need not worry too much about decoder complexity.
-
- For a stereo Layer-3-encoder achieving reference quality, our
- current real-time implementations use two DSP32C and two DSP56002.
- But again: as more and more horsepower becomes available on one
- chip, encoder complexity will matter less and less.
-
-
- Q: And what about the codec delay?
-
- A: Well, the standard gives some figures of the theoretical minimum
- delay:
- Layer-1: 19 ms (<50 ms)
- Layer-2: 35 ms (100 ms)
- Layer-3: 59 ms (150 ms)
- The practical values are significantly above that. As they depend
- on the implementation, exact figures are hard to give. So the
- figures in brackets are just rough rule-of-thumb values.
-
- Yes, for some applications, a very short delay is of critical
- importance. E.g. in a feedback link, a reporter can only talk
- intelligibly if the overall delay is below around 10 ms.
- If broadcasters want to apply MPEG audio coding, they have to use
- "N-1" switches in the studio to overcome this problem (or
- appropriate echo-cancellers) - or they have to forget about MPEG
- altogether.
-
- But with most applications, these figures are small enough to
- present no extra problem. At least, if one can accept a Layer-2
- delay, one can most likely also accept the higher Layer-3 delay.
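-
- If your application has a fixed delay budget, the figures above
- can be checked quickly. This Python sketch uses only the values
- from this FAQ; the rough value for Layer-1 ("<50 ms") is treated as
- 50 ms, which is an assumption, and the names are made up.

```python
# Layer -> (theoretical minimum delay, rough practical delay), both in ms.
CODEC_DELAY_MS = {1: (19, 50), 2: (35, 100), 3: (59, 150)}


def fits_delay_budget(layer: int, budget_ms: float, use_rough_value: bool = True) -> bool:
    """Return True if the codec delay of `layer` stays within `budget_ms`.

    With use_rough_value=True the rough practical figure is used,
    otherwise the theoretical minimum from the standard.
    """
    if layer not in CODEC_DELAY_MS:
        raise ValueError(f"unknown Layer: {layer}")
    theoretical_min, rough = CODEC_DELAY_MS[layer]
    delay = rough if use_rough_value else theoretical_min
    return delay <= budget_ms


if __name__ == "__main__":
    # Even the theoretical minimum of Layer-1 exceeds a 10 ms feedback budget.
    print(fits_delay_budget(1, 10, use_rough_value=False))
    print(fits_delay_budget(3, 200))
```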
-
-
- Q: OK, I am hooked! Where can I find more technical information
- about MPEG audio coding, especially about Layer-3?
-
- A: Well, there is a variety of AES papers, e.g.
-
- K. Brandenburg, G. Stoll, ...: "The ISO/MPEG-Audio Codec: A
- Generic Standard for Coding of High Quality Digital Audio", 92nd
- AES, Vienna 1992, preprint 3336
-
- E. Eberlein, H. Popp, ...: "Layer-3, a Flexible Coding Standard",
- 94th AES, Berlin 93, preprint 3493
-
- K. Brandenburg, G. Zimmer, ...: "Variable Data-Rate Recording on a
- PC Using MPEG-Audio Layer-3", 95th AES, New York 93
-
- B. Grill, J. Herre,... : "Improved MPEG-2 Audio Multi-Channel
- Encoding", 96th AES, Amsterdam 94
-
- And for further information, please contact layer3@iis.fhg.de...
-
-
- 3. Layer-3 Products
-
- This is a list of available Layer-3 products, as of 1.1.94.
- For further information, please contact the companies directly.
-
- 3.1. Telecommunication Codecs
-
- a) MusicTAXI Type 3
- The MusicTAXI is a real-time audio codec for the full-duplex
- transmission of mono or stereo audio signals via ISDN. It supports
- Layer-2 and -3.
- Dialog 4 System Engineering GmbH
- Monreposstr. 57
- D-71634 Ludwigsburg, Germany
- Fax +49-7141-22667
-
- b) MAGIC Series
- The Multi Audio-System with Groupable Interfaces and Codecs
- supports Layer-2 and -3 as well as G.722 and G.711. Its
- transmission procedures comply with H.221, H.242 or G.704. The
- codec is a universal device useful in ISDN applications as well as
- in satellite links, LAN or WAN networks or audio memory
- installations.
- PKI Philips Kommunikations Industrie AG
- Thurn-und-Taxis-Str. 14
- D-90411 Nuernberg, Germany
- Fax +49-911-526-6315
-
- c) Zephyr Codec
- The Zephyr is a Layer-3 codec for the transmission of mono or
- stereo audio signals via ISDN, Switch-56 or V.35-networks. It also
- offers a G.722 feedback link.
- Telos Systems
- 2101 Superior Avenue
- Cleveland, OH 44114, USA
- Fax +1-216-241-4103
-
- 3.2. Speech Announcement System
-
- a) DAS VIII HiFi
- This digital speech announcement system for mass transit
- applications applies Layer-3 to use the ROM based speech memory
- most efficiently. Moreover, the system offers an unrivalled sound
- quality at a very competitive price.
- Meister Electronic GmbH
- Koelner Str. 57
- D-51149 Koeln, Germany
- Fax +49-2203-12079
-
- 3.3. PC Boards
-
- a) Layer-3 PC Board
- This full-size PC/AT ISA card is a real-time audio processing
- board. It performs two-channel Layer-3 encoding and decoding,
- depending on the software configuration. The board offers digital
- audio interfaces (AES and IEC) and an additional X.21 interface
- for the reduced data stream. The board is delivered with a library
- of C drivers and a demo program.
- Audio Export Georg Neumann & Co. GmbH
- Badstr. 14
- D-74072 Heilbronn, Germany
- Fax +49-7131-68790
-
- b) L3-PC-Card
- This PC-Card supports a real-time Layer-3 audio codec. It offers
- digital audio interfaces (AES and IEC) and two additional X.21
- interfaces for one or two reduced data streams. A decoder-only
- PC card is also available.
- Dialog 4 System Engineering GmbH
- Monreposstr. 57
- D-71634 Ludwigsburg, Germany
- Fax +49-7141-22667
-
- 3.4. ICs
-
- a) ISO-MPEG Decoder Chip MASC 3500
- This MPEG decoder chip offers the use of the full ISO-MPEG-audio
- standard, i.e. Layer-1, -2, and -3. The ASIC is based on the MASC
- DSP family (0.8 um) and comes in a small 68 pin PLCC package.
- First samples will be available in 2.Q.94.
- ITT Intermetall GmbH
- Hans-Bunte-Str. 19
- D-79108 Freiburg, Germany
- Fax +49-761-517-880
-
- 3.5. Layer-3 Shareware
-
- The Layer-3 shareware is copyright Fraunhofer-IIS 1994.
-
- The programs are written for IBM PCs or compatibles with MS-DOS.
- While L3ENC.EXE and L3DEC.EXE should work on practically any PC, the
- other programs require a 386 type CPU plus hardware floating point
- support. Especially for the encoder, a 486DX33 or better is
- recommended.
-
- On a 486DX2/66 the performance of the software-only decoder is about
- 33% of the performance necessary for real time audio processing.
- The encoder needs about 30 minutes to encode a 1 minute audio data
- file. These figures assume coding/decoding of stereo audio material
- at 44.1 kHz sampling frequency.
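-
- What do these figures mean in practice? The Python sketch below
- converts between "fraction of real time" and processing time. It
- uses only the two figures quoted above (decoder at about 33% of
- real time, encoder at about 30 minutes per minute of audio); the
- function names are made up for this example.

```python
def processing_minutes(audio_minutes: float, realtime_fraction: float) -> float:
    """Time needed to process `audio_minutes` of audio when the software
    reaches `realtime_fraction` of real-time speed (1.0 = real time)."""
    if realtime_fraction <= 0:
        raise ValueError("realtime_fraction must be positive")
    return audio_minutes / realtime_fraction


def realtime_fraction(minutes_per_audio_minute: float) -> float:
    """Fraction of real-time speed, given the processing time per audio minute."""
    if minutes_per_audio_minute <= 0:
        raise ValueError("processing time must be positive")
    return 1.0 / minutes_per_audio_minute


if __name__ == "__main__":
    # Decoder: about 3 minutes of computing per minute of audio.
    print(round(processing_minutes(1, 0.33), 2))
    # Encoder: about 3% of real-time speed.
    print(round(realtime_fraction(30) * 100, 1), "%")
```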
-
- a) via anonymous ftp from fhginfo.fhg.de (153.96.1.4)
-
- You may download our Layer-3 audio software package from the
- directory /pub/layer3. You will find the following files:
- layer3.txt a short description of the files found in layer3.zip
- layer3.zip encoder, decoder, documentation and a sample bitstream
- layer3nb.txt a short description of the files found in layer3nb.zip
- layer3nb.zip encoder, decoder and documentation (no bitstream)
- bitstr.l3 sample bitstream
-
- b) via direct modem download (up to 14,400 bps)
-
- Modem telephone number : +49 911 9933662 Name: FHG
- Packet switching network: (0) 262 45 9110 10290 Name: FHG
- (For the telephone number, replace "+" with your appropriate
- international dial prefix, e.g. "011" for the USA.)
- Follow the menus as desired.
-
- c) via shipment of diskette (only including registration)
-
- You may order a diskette directly from:
-
- Mailbox System Nuernberg (MSN)
- Hanft & Hartmann
- Innerer Kleinreuther Weg 21
- D-90408 Nuernberg
- Germany
-
- Please note: MSN will only ship a diskette once the registration
- fee has been paid. The registration fee is 85 Deutsche Mark
- (plus sales tax, if applicable) for one copy of the package. The
- preferred method of payment is via credit card. Currently, they
- accept VISA and Master Card / Eurocard / Access credit cards.
-
- You may reach MSN also via Internet: msn@iis.fhg.de
- or via Fax: +49 911 9933661
- or via BBS: +49 911 9933662 Name: FHG
- or via X25: 0262 45 9110 10290 Name: FHG
- (e.g. in USA, please replace "+" with "011")
-
- d) via email
-
- You may get our shareware also by a direct request to msn@iis.fhg.de.
- In this case, the shareware is split into about 30 small uuencoded
- parts...
-
- 4. End of INFO.TXT
-